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ABSTRACT 

Covariance inflation and localization are two important techniques that are used to 
improve the performance of the ensemble Kalman filter (EnKF) by (in effect) adjusting 
the sample covariances of the estimates in the state space. In this work an additional 
auxiliary technique, called residual nudging, is proposed to monitor and, if necessary, 
adjust the residual norms of state estimates in the observation space. In an EnKF 
with residual nudging, if the residual norm of an analysis is larger than a pre-specified 
value, then the analysis is replaced by a new one whose residual norm is no larger than 
a pre-specified value. Otherwise the analysis is considered as a reasonable estimate and 
no change is made. A rule for choosing the pre-specified value is suggested. Based on 
this rule, the corresponding new state estimates are explicitly derived in case of linear 
observations. Numerical experiments in the 40-dimensional Lorenz 96 model show that 
introducing residual nudging to an EnKF may improve its accuracy and/or enhance 
its stability against filter divergence, especially in the small ensemble scenario. 

31 Both covariance inflation and localization are tech- 

niques that in effect adjust the sample covariances in the 
state space. Since data assimilation is a practice of estima- 
tion that incorporates information from both the state and 
observation spaces, it would be natural for one to make use 
of the information in the observation space to improve the 
performance of an EnKF. 

In this study we propose such an observation-space 
based auxiliary technique, called residual nudging, for the 
EnKF. Here a "residual" is a vector in the observation space, 
and is defined as the projection of an analysis mean onto 
the observation space subtracted from the corresponding ob- 
servation. In residual nudging our objective is to make the 
vector norm of the residual ("residual norm" for short) no 
larger than a pre-specified value. This is motivated by the 
observation that, if the residual norm is too large, then the 
corresponding analysis mean is often a poor estimate. In 
such cases, it is better off to choose as the new estimate a 
state vector whose residual norm is smaller. 

The method prese nted in this work is close to the idea 
of I Van Leeuwenl (|2010T ). in which a nudging term is added 
to the particle filter so that the projections of the particles 
onto the observation space are drawn closer to the corre- 
sponding observation, and the particles themselves are asso- 
ciated with almost equal weights. By doing so, the modified 
particle filter can achieve remarkably good performance us- 
ing only 20 p articles in the chaotic 40-dim ensional Lorenz-96 
(L96) model (|Lorenz and Emanuel [l998h . w hile traditional 
meth ods may need thousands of particles (|Van Leeuwenl . 
[20101 ). Other similar, residual-related, m ethods we r e also 
found in the liter a ture, f or examples, see I Anderson! (|2007l ; 
120091 ); ISong et al.l (|2010h . lAndersonl (|2007l ; l2009h "lmggested 
adaptive covariance inflation schemes in the context of hi- 
erarchical ensemble filtering. There the inflation factor A is 



for data assimilation in high dimensional systems. Because 
of its runtime efficiency and simplicity in implementation, 
it is receiving ever-increasing attentions from researchers 
in various fields. In many applications of the EnKF, due 
to limited computational resources, one is only able to run 
an EnKF with an ensemble size much smaller than the di- 
mension of the state space. In such circumstances, problems 
often arise, noticeably on the quality of the sample covari- 
ances, including, for instance, ra nk-deficiency, underestima- 
tion of the covariance matri ces (|Sacher and Bartellol . 120081 ; 
IWhitaker and Hamill l2002h . and spuriously large cross- 
varian ces between indepe ndent (or uncorrelated) state vari- 
ables (|Hamill et al.l . [200lh ■ To mitigate these problems, it is 
customary to intr oduce two auxiliary techniques, namely co- 
variance anfhatioiij^mde^ and local- 
ization (|Hamill et al. . 2001 ), to the EnKF. On the one hand, 
covariance inflation increases the estimated sample covari- 
ances in order to compensate for the effect of underestima- 
tion, which in fact increases the ro bustness of the EnKF in 
the sense of iLuo and Hoteitl | 201 ll ). On the other hand, co- 
variance localization introduces a "distance" -dependent ta- 
pering function to the elements of the sample covariances, 
and smooths out the spuriously large values in them. In ad- 
dition, covariance lo calization also incre ases the ranks of the 
sample covariances (|Hamill et all 120091 ). 



* Corresponding author, 
e-mail: xiaodong.luo@iris.no 



© 0000 Tellus 



2 BY X. LUO AND I. HOTEIT 



102 
103 



considered as a random variable (with a presumed initial 
prior distribution), and in effect adjusts the projection of 
the background (co)variances onto the observation spac^E 
With an incoming observation, the prior distribution is up- 
dated to the posterior one based on Bayes' rule, while the 
residual affects the s hape of the poster ior distribution of A. 
On the other hand, ISong et al.l (|2010h considered the idea 
of replacing an existing analysis ensemble member by a new 
one, in which the residual plays a role in generating the new 
ensemble member. 

Our main purpose here is to use residual nudging as a 
safeguard strategy, with which the projections of state esti- 
mates onto the observation space, under suitable conditions, 
are guaranteed to be within a pre-specified distance to the 
corresponding observations. We will discuss how to choose 
the pre-specified distance, and construct the (possibly) new 
state estimates accordingly in case of linear observations. In 
this work, the en semble adjustment Kalman filter (EAKF) 
(Anderson, 2001) is adopted for the purpose of demonstra- 
tion, while the extension to other filters can be done in a sim- 
ilar way. Through numerical experiments in the L96 model, 
we show that, the EAKF equipped with residual nudging 
(EAKF-RN) is more robust than the normal EAKF. In ad- 
dition, the accuracy of the EAKF-RN is comparable to, and 
sometimes (much) better than, that of the normal EAKF. 

This work is organized as follows. Section [2] reviews the 
filtering step of the EAKF, introduces the concept of resid- 
ual nudging, and discusses how it can be implemented in 
the EAKF. Section[3]investigates the effect of residual nudg- 
ing on the performance of the Kalman filter (KF) in a lin- 
ear/Gaussian system, which aims to provide some insights 
of how residual nudging may affect the behaviour of an al- 
ready optimal filter. Section 0] extends the investigation to 
the Lorenz 96 model, in which we examine the performance 
of the EAKF-RN in various scenarios, and compare it with 
the normal EAKF. Section discusses possible extensions 
of the current study and concludes the work. 



Ensemble Kalman filtering with residual 
nudging 



104 Suppose that at the kih assimilation cycle, one has a 

105 background ensemble = {x^ with n members. The 

106 incoming observation y k is obtained from the following ob- 

107 servation system 



y k = H fc x fc + v fc , 



(1) 



109 where is a matrix, and vj, is the observation noise, with 

no zero mean and covariance Rfc . For convenience of discussion, 

in we assume that the dimensions of Xfe and yk are m x and m y , 

ii2 respectively, m y ^ m x , and has full row rank. 



153 
154 
155 
156 



2. 1 The filtering step of the ensemble adjustment Kalman 
filter with covariance inflation and localization 

We first summarize the filtering step of the EAKF with 
both covariance inflation and localization. For simplicity, 
here we only consider the scenario with constant covariance 
inflation and localizati on, and refer readers to, for example, 
lAndersonl |2007l ; [20091 ) . for the details of adaptive configu- 
ration of the EAKF. In the context of EAKF, it is assumed 
that the covariance Rfe of the observation noise is a diagonal 
matrix, such that one can as similate the incoming observa- 
tion in a serial way. Following lAndersonl (|2007l ; l2009l ). we use 
a single scalar observation to demonstrate the assimilation 
algorithm in the EAKF. To this end, in this sub-section 
(only) we temporarily assume that the observation vector 
yk = y k is a scalar random variable, with zero mean and 
variance Rk- The notation of the incoming observation thus 
becomes y k , with the dimensio n m v = 1 . The algorithm 
description below mainly follows lAndersonl (|2007T l. 

Suppose that the i-th ensemble member x| ^ of X| con- 



sists of m x elements (x fc i )j (j — 1, • 



such that x fc 



[(x| • • • , (x| j)m x ] • Then the sample mean x^ of ~K k 



= x k,i/ n - To introduce covariance inflation to the fil- 



ter, suppose that AX£ = {Ax^ : Ax^ = x^ - k%}? =1 is 
the ensemble of deviations with respect to x£, and A ^ 1 
the inflation factor, then the inflated backgroun d ensemble 
is Xf- f = K"/ : x£/ = x£ + V\ Ax 6 fc ,jr=i (|Anderso4 



120071 : l2009h . With covariance inflation, X* fe " J and X£ have 
the same mean, but the sample covariance of Xj^ is A times 
that of X|. In what follows, we do not particularly distin- 
guish background ensembles with and without covariance 
inflation through different notations. Instead, we always de- 
note the background ensemble by X|, no matter whether it 
is inflated or not. One can tell whether a background ensem- 
ble is inflated by checking the value of A, e.g., A = 1 means 
no inflation, and A > 1 with covariance inflation. 

On the other hand, suppose that the projection of 



49 X fe onto the observation space is Y fe = {y k . 



H fc x b fc 



then one can compute the sample mean y k and 
sample variance p yy k as 



-6 

Vk 



Pyy,k 



1 - 

n 

1=1 

1 

— ^2(v b k,i 



(2) 



-6n2 

Vk) ■ 



With the incoming observation y k , one updates y k and p yVtk 
to their analysis counterparts, y k and p?,,, k , re spectively, 



through the following formulae ( Anderson! 120071 . Eq. (3.2 
3.3)). 



Pyy,k 



= liPm.k) 1 +Rk 1 }' 



Vk =Pyy,k[(Pyy,k 



1.6 . D -l 

yk + R k 



Vkl 



(3) 



1 In contrast, in residual nudging we are interested in adjust- 
ing the projection of the background mean. Comparison and/or 
combination of these two strategies will be deferred to future in- 
vestigations. 



158 Accordingly, one can update the projection Y k to its anal- 

159 ysis counterpart Y£ = {y kA : yZ,i = Vk,i + t>Vk,i}i=i> where 

160 the increments 6yk,i with respect to y k i are given by 



Syk,i — 



Pyy,k 

Pyy,k 



(Vk,i 



"6 \ , .-a b 

Vk) + Vk - Vk,i ■ 



(4) 
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162 One can verify that the sample mean and covariance of Y£ 

163 are y k and py y>k , respectively. Also note the difference be- 

164 tween the concepts of deviations and increments. For distinc- 

165 tion we have used A to denote deviations, and 5 increments. 

166 After the above quantities are calculated, one proceeds 

167 to update the background ensemble to the analysis one 

168 Xfe ee {Xfc_ 4 : = x b k ,i + <5x fe ,i}- l =1 , where the incre- 

169 ment 5x kl i with respect to the i-th background ensemble 

170 member x ki is an m x dimensional vector, i.e., 5~x.kj = 

171 [(5xfe,t)i, • • • , ((5x fc ,i) m;c ] T , where the j-th element (6xk,i)j 

172 of 5xk,i is given by 

173 (<5x fcii )j = (f xv , k /pty,k)Syk,i , j = 1, • • • , m x , (5) 

174 with fP xy k being the sample cross-variance between all the 

175 j-th elements of the ensemble members of , and the pro- 

176 jection ensemble = {y£ ;}" = i, i.e., 

1 ™ 

177 Pxy,k = [ 5Z[( x fc,0j - {^k)j][Vk,i - Vk] ■ (6) 

i=i 

178 With relatively small ensemble sizes, E g. Q often results i n 

179 spuriously large sample cross- variances (jHamill et al.l . l200ll ). 

180 To tack le this problem, one may introduce covariance local- 

181 ization l|Hamill et al.l . l200ll ) to the EAKF, in which the main 

182 idea is to multiply f? k in E q. $5$ by a "distance" -dependent 

183 tapering coefficient rjij «S 1 (|Andersonl 120071 ; l2009h . We will 

184 discuss how to compute r/ij in the experiments with respect 

185 to the L96 model. 

186 After obtaining the analysis ensemble X£, one computes 

n 

187 the analysis mean x£ = X] x fc,i/ n (analysis for short), and 

i = l 

188 uses it as the posterior estimate of the system state. Prop- 

189 agating X£ forward through the dynamical model, a back- 

190 ground ensemble at the next assimilation time is obtained, 

191 and a new assimilation cycle starts, and so on. 

192 2.2 Residual nudging 

193 As will be shown later, the EAKF may suffer from 

194 filter divergence in certain circumstances, even when it is 

195 equipped with both covariance inflation and localization. To 

196 mitigate filter divergence, intuitively one may choose to ad- 

197 just the estimate x£ and move it closer toward the truth 

198 x k r . In practice, though, xj, r is normally unknown, thus it is 

199 infeasible to apply this state-space based strategy. In what 

200 follows, we introduce a similar, but observation-space based 

201 strategy, in which the main idea is to monitor, and, if nec- 

202 essary, adjust the residual norm of the estimate. For this 

203 reason we refer to this strategy as residual nudging. 

204 By definition, the residual with respect to the analysis 

205 mean xJJ is r k = Hfcx£ — y k . We also define the 2- norm of a 

206 vector z as 

207 ||z|| 2 EE Vz T Z . (7) 

208 The objective in residual nudging is the following. We accept 

209 x^ as a reasonable estimate if its residual norm ||f^||2 is no 

210 larger than a pre-specified value, say, /3-\/trace(Rfc), with 

211 P > being called the noise level coefficient hereafter (the 

212 reason in choosing this pre-specified value will be explained 

213 soon). Otherwise, we consider x£ a poor estimate, and thus 

214 find for it a replacement, say, x^, based on the estimate xj 

215 and the observation y£, so that the residual norm of x^ is 



216 no larger than /3^/trace(Rfc). To this end, we stress that the 

217 assumption m y ^ m x may be necessary in certain cases (see 

218 the discussion later). In this work we focus on the cases with 

219 m y ^ m x , which is true for many geophysical problems. 

220 The objective of residual nudging can be achieved as 

221 follows. First of all, we compute a scalar c k £ [0, 1], called 

222 the fraction coefficient hereafter (cf. Eq. ()9a|) later for the 

223 reason), according to the formula 

224 c k =min(l,/Vtrace(Rfc)/||?fc||2), (8) 

225 where the function min(a, 6) finds the minimum between 

226 the scalars a and b. The rationale behind Eq. ([8]) is this: if 

227 ||r£||2 > /3\/trace(Rfc), then we need to multiply ||ffc||2 by a 

228 coefficient c k < 1 to reduce ||r£||2 to the pre-specified value. 

229 Otherwise, we do nothing and keep ||ffc||2 as it is, which is 

230 equivalent to multiplying ||ffc||2 by Ck = 1. 

231 Next, we construct a new estimate x£ by letting 

232 Xfc = c fc Xfc + (1 - c fe )xfe , (9a) 
HI x£ = HjftHjHjfrV* . (9b) 

235 The term (HfeH^ ) _1 in Eq. (J9bj) is the Moore-Penrose 

236 generalized inverse of H^, such that x£ in Eq. (|9b[l pro- 

237 vides a least-squa re solution for the equation H^x = y£ 

238 l|Engl et al.1 . |2000| . ch. 2). We refer to x£ as the observa- 

239 tion inversion hereafter. With Eq. @, the new residual 

240 it = H fc xg - yg = c k r%, so that ||f£|| a = c k \\r%h < 

241 /3-\/trace(Rfc) according to Eq. ([8]). 

242 In residual nudging we only attempt to adjust the anal- 

243 ysis mean x£ of the EAKF, but not its covariance. To this 

244 end, let the analysis ensemble be X£ = {x£ >4 : x k i = 

245 x^ + Ax^ J" =1 , where the deviations AxJ , = x£ ; - x£. We 

246 then replace the original analysis mean x£ by x£, and change 

247 the analysis ensemble to X£ = {x^j : xjj 4 = x^ + Ax^j}"^. 

248 Therefore, in comparison with the normal EAKF, the EAKF 

249 with residual nudging (EAKF-RN for short) just has addi- 

250 tional steps in Eqs. @ and©, while all the other procedures 

251 remain the same. In doing so, residual nudging is compatible 

252 with both covariance inflation and localization. 

253 2. 3 Discussion 

254 Choosing the pre-specified value in the form of 

255 /3-\/trace(Rfc) is motivated by the following consideration. 

256 Let x| r be the truth such that y% = HfcX^T + v k . Then 

257 r% = H fe x£ - y% = H fe (x^ - xf) - v fc , and by the triangle 

258 inequality, 

Hh< ||H»(xS-x£-)|| a + ||v*|| a . (10) 

260 For a reasonably good estimate x£, we expect that the mag- 

261 nitude of H^x^ — HfcxJ. r should not substantially exceed 

262 the observation noise level. On the other hand, we have 

263 (E||v fe || 2 ) 2 < E||v fc ||| = trace(E(v fc vjO) = trace(R fc ), thus 

264 the expectation E||vfc || 2 of the norm of the observation noise 

265 is (at most) in the order of -\/trace(Rfc). One may thus use 

266 \J trace(Rfc) to characterize the noise level. By requiring 

267 that a reasonably good estimate have ||Hfc(x£ — x^T) || 2 in 

268 the order of -\/trace(Rfc) (or less), one comes to the choice 

269 in the form of / 8^/trace(Rfe). The criterion in choosing the 

270 above threshold is very similar to that in certain quality con- 

271 trol algorithm s (called check of plausibility, see, for example 

272 iGandinl . 1 19881 . for a survey) , in which one is assumed to have 
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273 prior knowledge about, say, the mean y 3 and variance a 3 of a 

274 scalar observation y 3 . In quality control, y 3 is often assumed 

275 to be a Gaussian random variable, so that for a measured 

276 observation y°, if the ratio \y° — y s \/a a is too lar ge, then y° 

277 is discarded, or at least suspected |Gandinlll988l '). The main 

278 differences between residual nudging and quality control are 

279 the following. While quality control checks the plausibility of 

280 an incoming observation, residual nudging checks the plau- 

281 sibility of a state estimate, and suggests a replacement if the 

282 original state estimate does not pass the test. Moreover, as 

283 long as the 2-norm is used, the expectation E||vfc||| is al- 

284 ways trace (R*), independent of the distribution of vt,. This 

285 independence, on the one hand, implies that the inequality 

286 in (|10[) . hence the threshold /3y / trace(R*y, holds without re- 

287 quiring the knowledge of the distribution of Hfc(x^ — xj. r ). 

288 On the other hand, the absence of the knowledge of the dis- 

289 tribution means that less statistical information is gained in 

290 choosing the threshold /3i/trace(Rfc). For instance, one may 

291 not be able to assign a statistical meaning to /3-\/trace(Rfc), 

292 nor obtain a confidence (or significance) level in accepting 

293 (or rejecting) a state estimate. Finally, it is also possible 

294 for one to adopt another distance metric, e.g., the 1- or oo- 

295 norm, for which the inequality in (|I0|l still holds. In such cir- 

296 cumstances, the expectation, E||v||i or EHv^H^, may not be 

297 equal to trace(Rfc) any more, so that one may need to choose 

298 a threshold different from fiy/ trace(Rfc). Despite the stated 

299 differences, we expect that residual nudging can be used in 

300 conjunction with observation quality control, although this 

301 is not pursed in the current study. 

302 Even though the noise level coefficient j3 in residual 

303 nudging is chosen to be time- invariant, the resulting fraction 

304 coefficient Ck in general changes with time according to Eq. 

305 ||5J. The coefficient /3 affects how the new analysis x£ com- 

306 bines the original one x£ and the observation inversion x£. 

307 This can be seen from Eqs. ((HJ and (|9a[) . Because ct £ [0, 1], 

308 the new analysis x£ in Eq. (|9a|l is a convex combination of 

309 x^ and x£, i.e., an estimate somewhere in-between the origi- 

310 nal estimate x^ and the observation inversion x£, depending 

311 on the value of Ck . If one chooses a large value for /3, or, if for 

312 a fixed f3 the original residual norm r£ is sufficiently small, 

313 then the fraction coefficient Ck — >• I according to Eq. 

314 thus x5J — ¥ x£ according to Eq. ()9a|) . Therefore x£ will be 

315 a good estimate if x£ is so, but may not be able to achieve 

316 a good estimation accuracy when x£ itself is poor. On the 

317 other hand, if one chooses a very small value for /?, or, if for 

318 a fixed /3 the original residual norm r£ — > +oo (e.g., with fil- 

319 ter divergence), then Ck — > and x£ — > x£. In this case, the 

320 estimate xj is calculated mainly based on the information 

321 content of the observation y£, and may result in a relatively 

322 poor accuracy. This is largely because of (I) the presence of 

323 the observation noise v/t in Eq. fTJ, and (2) the ignorance 

324 of the prior knowledge of the model dynamics. As a result, 

325 pushing the projection of state estimates very close to noisy 

326 observations may have some negative consequences. For in- 

327 stance, in geophysical applications, dynamical balances of 

328 the numerical models may not be honored so that the es- 

329 timation errors may be relatively large. However, using x£ 

330 as the estimate may be a relatively safe (although conserva- 

331 tive) strategy against filter divergence. In the sense of the 

332 above discussion, the choice of j3 reflects the extent to which 

333 one wants to achieve the trade-off between a filter's poten- 



334 tial accuracy and stability against divergence. This point is 

335 further demonstrated through some experiments later. 

336 Some numerical issues related to the computation of the 

337 observation inversion x£ are discussed in order. One is the 

338 existence and uniqueness of the observation inversion. Under 

339 the assumptions that m y ^ m x and that Hfe is of full row 

340 rank, the observation i nversion , as a solution of the equation 

341 HfcX = y£, does exist (|Meverl . l200ll . ch. 4). Finding a con- 

342 crete solution, however, is in general an under-determined 

343 problem, hence the solution is not unique unless m y — m x . 

344 This point can be seen as follows. When m y < m x , the null 

345 space of Hk contains non-zero elements, i.e., the re exist 

346 eleme nts x„ £ S^, x„ ^ 0, such that H fe x„ = (|Meveij . 

347 l200ll . ch. 4). As a result, given an observation inversion x£, 

348 x£ + x„ is also a solution of the equation H^x = y£ for 

349 any x n £ S^. Therefore, which solution one should take is 

350 an open problem in practice. In the context of state esti- 

351 mation, it is desirable to choose a solution that is close to 

352 the truth x. k r , which, unfortunately, is infeasible without the 

353 knowledge of xj. r . As a trade-off, one may choose as a so- 

354 lution some estimate that possesses certain properties. The 

355 Moore-Penrose generalized inverse x£ given in Eq. ()9b|) is 

356 such a choice, which is the unique, and "best-approximate" , 

357 solution in the sense that it has the minimum 2-norm among 

358 all least-squares solutions (End et al. 1. 12000| . Theorem 2.5). 

359 It is also worth mentioning what may happen if our as- 

360 sumptions, that m y ^ m x and that is of full row rank, 

361 are not valid. In the former case, with m y > m x , the equa- 

362 tion HfcX = yl is over-determined, meaning that there may 

363 be no solution that solves the equation exactly. One may 

364 still obtain an approximate solution by recasting the prob- 

365 lem of solving the linear equation as a linear least-squares 

366 problem, which yields the unique, least-squares solution in 

367 the form of x£ = (H^ Hfc) _1 H^y^, similar to (but differ- 

368 ent from) Eq. (|9b[) . Because H^x^ — y£ may not be in 

369 general, one may thus not be able to find a new estimate 

370 Xfe with a sufficiently small (e.g., zero) residual. Therefore, 

371 the inequality [|f£||2 ^ /3\Arace(RA;) may not hold for some 

372 sufficiently small j3. This restriction is consistent with the 

373 nature of over-determined problems (that is, no exact so- 

374 lution). It does not necessarily mean that residual nudging 

375 cannot be applied to an over-determined problem, but in- 

376 stead implies that the noise level coefficient /3 should entail 

377 a lower bound that may be larger than 0. 

378 In the latter case, without loss of generality, suppose 

379 that m y ^ m x and Hfe is not of full row rank, then 

380 the matrix product HfeH^ is singular, so that it may be 

381 numerically unstable to compute its inverse. In such cir- 

382 cumstances, one needs to employ a certain regularization 

383 technique to obtain an approximate, but stable, solution. 

384 For instanc e , one may adopt the Tikhonov regularization 

385 (|Engl et all . boOfj . ch. 4) so that the solution in Eq. (j9bj 

386 becomes x% = H^(HfeH^ + al) -1 y£, where a is the reg- 

387 ularization parameter chosen according to a certain crite- 

388 rion. The observation inversion in Eq. ()9b|l can be treated 

389 as a special case of the Tikhonov regularization solution with 

390 a = 0, while the concept of residual nudging is also applica- 

391 ble to the general cases with a/0 following our deduction 

392 in § I2.2FI . In this sense, the state estimate of the EAKF-RN 

2 In general cases with a ^ 0, it can be shown that a sufficient 
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393 can be considered as a hybrid of the original EAKF estimate 

394 and the (regularized) least-squares solution of the equation 

395 HfcX = y£. This point of view opens up many other possi- 

396 bilities, given the various types of regularization te chniques 

397 in the literature (see, for example. lEngl et al.ll200oh . 

398 The computation of the matrix product H^(HfcH^) -1 

399 is a non-trivial issue in large-scale problems, and is wor- 

400 thy of further discussiorff]. In general cases where the ob- 

401 servation operator Hfc is time varying, the computational 

402 cost is comparable to that in evaluating the Kalman gain. 

403 In terms of numerical compu tations , one possible choice 

404 is to apply QR factorization (|Meveij . l200ll . ch. 5) to H£ 

405 such that is factorized as the product of an orthogonal, 

406 m x x m, matrix Q and an upper-triangular, m, x m y ma- 

407 trix U, where for notational convenience we drop the time 

408 index k in these matrices. Note that QQ T = Q T Q = Im x , 

409 and U = [U^ y ,0f m<e _ my)my ] T , with l m<c being the m x - 

410 dimensional identity matrix, 0( mx _ m ) m the (m, — m y ) x 

411 m y zero matrix, and U mj a non-singular, upper-triangular, 

412 m y x m y matrix in which all elements below the main di- 

413 agonal are zero. With some algebra, it can be shown that 

414 the product H^(HfcH^)- 1 = Q , (mx _ my)m J T = 

415 Qm I m f (U m 1 j ) T , where Q mx my is a matrix that is com- 

416 prised of the first m y columns of Q, and the inverse of 

417 the upper-triangular matrix U m can be computed element- 

418 by-element in a recursive way (called back substitution, 

419 iMeverllioOll . ch. 5). In certain circumstances, further reduc- 

420 tion of computational cost and /or storage c an be achieved, 

421 for instance, when Hfc is sparse (|Meverl . [200ll , ch. 5); or when 

422 Hfc is time invariant, e.g., in a static observation network. 

423 In the latter case, one only needs to evaluate the product 

424 H^(HfcH^) -1 once and for all. 



425 3 Numerical results in a linear scalar system 

426 Here we use a scalar, first order autoregressive (AR1) 

427 model driven by Gaussian white no ise, to investiga te the per- 

428 formance of the Kalman filter fKF lKaImarlll96Ch . and that 

429 of the KF with residual nudging (KF-RN), in which residual 

430 nudging is introduced to the posterior estimate of the KF in 

431 the same way as in the EAKF. The motivation in conducting 

432 this experiment is the following. With linear and Gaussian 

433 observations, the KF provides the optim al estimate in th e 

434 sense of, for instance, minimum variance (|jazwinskil . ll970l ). 

435 Therefore, we use the KF estimate as the reference to ex- 

436 amine the behaviour of the KF-RN under different settings, 

437 which reveals how residual nudging may affect the perfor- 

438 mance of the KF. 

439 The scalar AR1 model is given by 

440 Xk+1 = 0.9 x k + u k , (11) 

441 where Uk represents the dynamical noise and follows the 

condition to achieve residual nudging is, for example, Cfc(||r?||2 — 
l|y£lh) ^ /3\/tracc(R fe ) — ||y£ H2, with the (possibly) new estimate 

again given by Eq. H9al l, 
3 For the experiments to be presented later, since the dimensions 
of the dynamical models are relatively low, we choose to directly 
compute the matrix product Hj(Hj.Hj') — 1 . The matrix inver- 
sion (HfcH^)- 1 is done through the MATLAB (R2011b) built-in 
function INV. 



442 Gaussian distribution with zero mean and variance 1 , and is 

443 thus denoted by u k ~ N(u k : 0, 1). The observation model 

444 is described by 

445 yk — x k + v k , (12) 

446 where Vk ~ N(vk : 0, 1) is the observation noise, and is 

447 uncorrelated with Uk- 

448 In the experiment, we integrate the AR1 model forward 



449 for 10,000 steps (integration steps hereafter), with the ini- 

450 tial value randomly drawn from the Gaussian distribution 

451 N(0, 1), and the associated initial prior variance being 1. 

452 The true states (truth) {xk}i^i are obtained by drawing 

453 samples of dynamical noise from the distribution N(0, 1), 

454 and adding them to Xk to obtain Xk+i at the next inte- 

455 gration step, and so on. The synthetic observations y k axe 

456 obtained by adding to model states Xk samples of obser- 

457 vation noise from the distribution N(0, 1). For convenience 

458 of comparison, we generate and store synthetic observations 

459 at every integration step. However, we choose to assimilate 

460 them for every S a integration steps, with S a £ {1, 2, 4, 8}, in 

461 order to investigate the impact of S a on filter performance. 

462 In doing so, data assimilation with different S a , or other ex- 

463 periment settings (e.g., the noise level coefficient /3 in the 

464 KF-RN), will have identical observations at the same inte- 

465 gration steps. For convenience, hereafter we may sometimes 

466 use the concept "assimilation step", with one assimilation 

467 step equal to S a integration steps. In addition, we may also 

468 call S a the assimilation step when it causes no confusion. 

469 In the KF-RN, we also choose to vary the noise level 

470 coefficient /3, with /3 G {0.01, 0.05, 0.1, 0.5, 1, 2, 3, 4, 6, 8, 10}, 

471 in order to investigate its effect on filter performance. To 

472 reduce statistical fluctuations, we repeat the experiment 20 

473 times, each time with randomly drawn initial value, samples 

474 of dynamical and observation noise (so that the truth and 

475 the corresponding observations are produced at random). 

476 Except for the introduction of residual nudging, the KF-RN 

477 have the same configurations and experiment settings as the 

478 KF. 

479 We use the average root mean squared error (average 

480 RMSE) to measure the accuracy of a filter estimate. For 

481 an m^-dimensional system, the RMSE of an estimate 

482 Xfc = [xk,i, • • • , i*. rai ] T with respect to the true state vector 

483 Xfc" = [x t k 1 , ■ ■ ■ ,x k r m \ at time instant k is defined as 

484 e k = II Xfc - Xfe r || 2 /\/mI. (13) 

485 The average RMSE e k at time instant k over M repetitions 

486 of the same experiment is thus defined as ik = YLj=i e t/M 

487 (M = 20 in our setting), where e J k denotes the RMSE at 

488 time instant k in the jth repetition of the experiment. We 

489 also define the time mean RMSE e as the average of &k over 

490 the assimilation time window with N integration steps, i.e., 

491 e = J2 1 Li^k/N (N= 10000 here). 

492 We also use the spread to measure the estimated un- 

493 certainty associated with an estimation. To this end, let Pfc 

494 be the estimated covariance matrix with respect to the es- 

495 timate Xfc. Then the spread Sk at time instant k is defined 

496 as 



497 Sfc = \J trace(Pfc)/m :E . (14) 

498 The average spread s k and the time mean (average) spread 
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499 s are defined in a way similar to their counterparts with 

500 respect to the RMSE. 

501 Table [1] reports the time mean RMSEs and spreads of 

502 the KF at different assimilation steps S a - The time mean 

503 RMSE of the KF grows as S a increases, indicating that the 

504 performance of the KF deteriorates as the assimilation fre- 

505 quency decreases. The time mean spread of the KF exhibits 

506 a similar tendency as S a increases. However, the time mean 

507 spread tends to be larger than the time mean RMSE, indi- 

508 eating that the corresponding variance is over-estimated. 

509 Fig. [1] shows the time mean RMSEs of the KF-RN 

510 (dash-dot lines marked by diamonds), as functions of the 

511 noise level coefficient /3, at different assimilation steps S a - 

512 Given the different orders of magnitudes of /3, we adopt the 

513 logarithmic scale for the x-axes. For comparison, we also 

514 plot the time mean RMSEs of the KF (solid lines) at each 

515 S a - Since the time mean RMSEs of the KF are independent 

516 of the choice of /?, they are horizontal lines in the plots. 

517 However, the choice of /3 does influence the performance of 
sis the KF-RN. As shown in all of the plots of Fig. [T] if one 

519 adopts a small /?, say /3 = 0.01, for the KF-RN, then the 

520 resulting time mean RMSE is higher than that of the KF. 

521 This is because such a choice may force the KF-RN to rely 

522 excessively on the observations when updating the prior es- 

523 timates, such that the information contents in the prior esti- 

524 mates are largely ignored. As f3 grows, the time mean RMSE 

525 of the KF-RN decreases, and eventually converges to that of 

526 the KF when /3 is sufficiently large, say /3 > 3. These results 

527 are consistent with our expectation of the behaviour of a 

528 filter equipped with residual nudging, as has been discussed 

529 in § rj31 

530 It is also of interest to gain some insights of the be- 

531 haviour of the fraction coefficients Ck in the KF-RN with 

532 different /3. To this end, Fig. [5] plots two sample time series 

533 of Cfc in the KF-RN with p = 0.1 (upper left panel), and 

534 P — 1 (lower left panel), respectively, together with their 

535 corresponding histograms (right panels). For convenience of 

536 visualization, the assimilation time window is shortened to 

537 1000 steps (with the observations assimilated for every 4 

538 steps). At /3 = 0.1, Cfc tends to be relatively small, with the 

539 mean value being 0.4213 and the median 0.3027. Among 

540 the 250 Ck values, 210 of them are less than 1, meaning that 

541 residual nudging is effective at those steps. A histogram of 

542 Ck is also shown on the upper right panel. There it indicates 

543 that Ck distributes like a U-shape, with relatively large pro- 

544 portions of Ck taking values that are less than 0.2, or equal 

545 to 1. On the other hand, at /3 = 1, c/t tends to remain close 

546 to 1, with the mean being 0.9892 and the median 1, and 

547 only 16 out of 250 Ck values are less than 1. These are also 

548 manifested in the histogram on the lower right panel, where 

549 one can see that Ck largely concentrate on 1. 

550 In Table Q] we report the minimum time mean RMSEs 

551 that the KF-RN can achieve by varying the value of /3 at 

552 different S a , together with the values of the j3 at which the 

553 minima are obtained for specific S a - When S a = 1,2, the 

554 minimum time mean RMSEs of the KF-RN, both achieved 

555 at P = 2, are (very) slightly lower than the time mean RM- 
sse SEs of the KF; and the time mean RMSEs of the KF-RN 

557 become the same as those of the KF when /3 ^ 3. On the 

558 other hand, when S a = 4, 8, the minimum time mean RM- 

559 SEs of the KF-RN are identical to the time mean RMSEs of 

560 the KF, and are obtained when /3 ^ 2. The reason that the 



561 KF-RN can have lower time mean RMSEs than the "op- 

562 timal" KF at S a = 1,2 might be the following. The clas- 

563 sic filtering theory states that the KF is optimal und er the 

564 minimum variance (MV) criterion l|jazwins E Il970t ). that 

565 is, taking the mean of the posterior conditional pdf as the 

566 state estimate, the KF has the lowest possible expectation 

567 of squared estimation error. Note that here the expectation 

568 is taken over all possible values of the truth (i.e., by treating 

569 the truth as a random variable). Therefore, in principle one 

570 has to repeat the same experiment for a sufficiently large 

571 number of times (with randomly drawn truth) in order to 

572 verify the performance of the filters under the MV criterion. 

573 For computational convenience, though, we only repeat the 

574 experiment 20 times. Thus in our opinion the slight out- 

575 performance of the KF-RN might be largely attributed to 

576 statistical fluctuations. 

577 In Table [1] we do not present the time mean spreads of 

578 the KF-RN because they are in fact identical to those of the 

579 KF. This is because in the KF, the forecast and update of 

580 the (estimated) covariance matrix of the system state are 

581 not influenced b y the mean estimate of the system state 

582 l|jazwinskil . Il970l ) . Since residual nudging only changes the 

583 estimate of the system state (if necessary) and nothing else, 

584 it is expected that the KF and KF-RN share the same covari- 

585 ance matrix. This point, however, is not necessarily true in 

586 the context of ensemble filtering in a nonlinear system. For 

587 instance, if the dynamical model is nonlinear, then the back- 

588 ground covariance at the next assimilation time is affected 

589 by the analysis mean at the current time, such that two 

590 analysis ensembles with different sample (analysis) means 

591 but identical sample (analysis) covariance may result in dif- 

592 ferent sample (background) means and covariances at the 

593 next assimilation time. 

594 The above results suggest that it may not be very mean- 

595 ingful to introduce residual nudging to a Bayesian filter that 

596 already performs well. In practice, thoug h, due to the ex- 

597 istence of various sources of uncertainties (|Andersonl . 120071 : 

598 ILuo and Hoteitl . l201ll ) , a Bayesian filter is often sub-optimal , 

599 and is even likely to suffer from divergence (|Schlee et all 

600 Il967h . In such circumstances, instead of only looking into 

601 the accuracy of a filter, it may also be desirable to take 

602 the stability of the filter into account. Through the experi- 

603 ments below we show that equipping the EAKF with resid- 

604 ual nudging can not only help improve its stability, but also 

605 achieve a filter accuracy that is comparable to, sometimes 

606 even (much) better than, that of the normal EAKF, espe- 

607 cially in the small ensemble scenario. 



60s 4 Numerical results in the 40-dimensional L96 

609 model 

610 4-1 Experiment settings 

611 Here we use the 40-dim ensional Lorenz-96 (L96) model 

612 l|Lorenz and Emanuell . 1 19981 ) as the testbed. The governing 

613 equations of the L96 model are given by 

614 -J^ = (cci+i - Xi-2)xi-i - Xi + F, i = 1, • • • ,40. (15) 

615 The quadratic terms simulate advection, the linear term rep- 

616 resents in ternal dissipat ion, and F acts as the external forc- 

617 ing term ()Lorenzl . Il996l ). Throughout this work, we choose 
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618 F — 8 unless otherwise stated. For consistency, we define 672 

619 X-i = £39, xq = 240, and X41 = xi in Eq. (|15|l . and con- 673 

620 struct the state vector x = [xi,X2, ■ ■ ■ , £40] T . 

621 We use the fourth-order Runge-Kutta method to inte- 

622 grate (and discretize) the system from time to 75, with 

623 a constant integration step of 0.05. To avoid the transition 675 

624 effect, we discard the trajectory between and 25, and use 676 

625 the rest for data assimilation. The synthetic observation yj, 677 
is obtained by measuring (with observation noise) every d 678 



all experiments the normal EAKF and the EAKF-RN have 
identical configurations and experiment settings. 



626 

627 elements of the state vector xt 

628 time instant k, i.e., 



629 



[xk,i, Xk,2, ■ ■ • ,Xk,io] at 



y k = H x fc + v fc 



(16) 



630 where H is a (J + 1) x 40 matrix such that H Xfc = 683 

631 [xu,i, Xk,i+d, ■•■ ,Xk,i+.jd] T , with J = floor(39/d) being the 684 

632 largest integer that is less than, or equal to, 39/d, and Vfc 685 

633 is the observation noise following the Gaussian distribution 686 

634 N{wh : 0,lj+i), with Ij+i being the (J + l)-dimensional 687 

635 identity matrix. The elements (H d ) P9 of the matrix H d can 688 

636 be determined as follows. 689 



(H d ) P9 



1 if q = (p — l)d + 1 , otherwise (H ) I 



0. 



690 
691 



for p = 1, • • ■ , ( J + 1), q = 1, • • ■ , 40. In all the experiments 
below, we generate and store the synthetic observations at 
every integration step, but assimilate the observations for 
every 4 integration steps unless otherwise stated. 

The niters in the experiments are configured as fol- 
lows. To generate an initial background ensemble, we run 
the L96 model from to 2500 (overall 50000 integra- 
tion steps), and compute the temporal mean and covari- 
ance of the trajectorjQ. We then assume that the ini- 
tial state vectors follow the Gaussian distribution with the 
same mean and covariance, and draw a specified number 
of sample s to form the background ense mble. Covariance 
inflation dAnderson and Anderson] . 1 1999h and localization 



(|Hamill et all . 120011 ) are conducted in all the experiments. 
Concretely, covariance inflation, with the inflation factor A, 
is introduced following the discuss ion in § | 2.1l Covarianc e 
localization is conducted following lAndersonl (|2007l ; l2009f ). 
which introduces an additional pa rameter l c , call e d the 
length scale (or half-width following lAndersonl [20071 ; 120091 ') 
hereafter, to the EAKF. The distance dij between two state 
variables Xi and Xj are defined as dij = min(|i — j|/40, 1 — 
\i — j |/40), and the corresponding tapering coefficient r/ij 
(cf. the text below Eq. (O) is det ermined by the fift h -orde r 
polynomial function £,{dij,l c ) in iGaspari and Cohnl (|l999l ) 
with half- width l c . For dij < 2l c , one has < rji 3 1 ^ 1, and 
rjij = otherwise. With both covariance inflation and local- 
ization, the performance of the normal EAKF is in general 
comparable to the established results with respect to the 
L96 model unde r simil a r experimen t settin g, see, for exam- 
ple, |feti£eial] |20o3) ; HuMeTaD $200^). 

To reduce statistical fluctuations, we repeat each exper- 
iment below for 20 times, each time with randomly drawn 
initial state vector, initial background ensembles and obser- 
vations. Except for the introduction of residual nudging, in 



4 Let {xfej^j be a set of state vectors at different time instants 728 

which form a state trajectory from time instant 1 to N. Then the 729 

temporal mean and covariance of the trajectory arc taken as the 730 

sample mean and covariance of the set {xj,}^^ respectively. 731 



674 1^.2 Experiment results 

4-2.1 Results with different observation operators Here we 
consider four different observation operators H d , with d — 
1, 2, 4, 8, respectively. For convenience, we refer to them as 
the full, 1/2, 1/4 and 1/8 observation scenarios, respectively. 
The concrete configurations of the normal EAKF and the 
EAKF-RN are the following. In both filters the ensemble size 
is fixed to be 20. The half-width l c of covariance localization 
increases from 0.1 to 0.5, each time with an even increment 
of 0.1. For convenience we denote this setting by l c £ {0.1 : 
0.1 ; 0.5}. Similar notations will be frequently used later. 
The inflation factor A £ {1 : 0.05 : 1.25}, and the noise level 
coefficient (3 = 2 in the EAKF-RN. 

The upper panels of Fig. [3] shows the contour plots of 
the time mean RMSEs of the normal EAKF (left), and that 
of the EAKF-RN (right), in the full observation scenario, 
as functions of the inflation factor A and the half- width l c . 
Given a fixed A, the time mean RMSEs of both the EAKF 
and EAKF-RN tend to increase as the half-width l c in- 
creases. On the other hand, given a fixed Z c , when l c = 0.1, 
the time mean RMSEs of both filters exhibit the U-turn be- 
haviour, i.e., the time mean RMSEs tend to decrease as A 
grows, until it reaches a certain value (1.10 for both filters). 
After that, the time mean RMSEs will increase instead as A 
grows further. However, when l c > 0.1, the time mean RM- 
SEs of both filters tend to decrease as A increases within the 
range of tested A. The normal EAKF achieves its minimum 
time mean RMSE (0.5605) at the point (l c = 0.1, A = 1.10), 
and the EAKF-RN also hits its minimum time mean RMSE 
(0.5586) at the same place. In general, the EAKF and the 
EAKF-RN have similar performance at l c = 0.1, but at other 
places the EAKF-RN may perform substantially better than 
the EAKF. For instance, at (l c = 0.4, A = 1.05) the time 
mean RMSE of the normal EAKF is about 3.3, while that 
of the EAKF-RN is about 1.6. Moreover, a filter divergence 
is spotted in the normal EAKF at (l c = 0.3, A = 1.25), so 
that the contour plot around this point is empty and indi- 
cates no RMSE value. Filter divergence, however, is not ob- 
served in the EAKF-RN at the same place. For clarity, here 
a "divergence" is identified as an event in which the RMSE 
of a filter becomes abnormally large. More specifically, the 
filter is considered divergent in the Lorenz 96 model, if its 
RMSE at any particular time instant is larger than 10 3 . As 
mentioned previously, we repeat each experiment 20 times in 
order to reduce statistical fluctuations. In accordance with 
this setting, a filter divergence is reported whenever there is 
at least one (but not necessarily all) divergence(s) out of 20 
repetitions. 

In the 1/2 and 1/4 observation scenarios, there are 
many cases in which filter divergences are spotted. For this 
reason, we choose to directly report the assimilation results 
in Tables [2] and [3] respectively, rather than show their con- 
tour plots as in the full observation scenario. In the 1/2 ob- 
servation scenario, filter divergences of the normal EAKF, 
marked by "Div" in Table [2] are spotted in 24 out of 30 
different combinations of l c and A values (5 l c values by 
6 A values). In contrast, in the EAKF-RN no filter diver- 
gence is observed. On the other hand, when there is no fil- 
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732 ter divergence occurring in either filter, the performance of 

733 the EAKF and the EAKF-RN is very close to each other, 

734 with the time mean RMSEs of the EAKF-RN slightly lower 

735 than those of the EAKF, except at (l c = O.f , A = 1.15) and 

736 (l c — 0.1, A = 1.25). The situation in the 1/4 observation is 

737 similar. As shown in Table [3] the EAKF diverges in 17 out 

738 of 30 tested cases, while there is no filter divergence spotted 

739 in the EAKF-RN. The performance of the EAKF and the 

740 EAKF-RN is close to each other when the EAKF does not 

741 diverge. 

742 The lower panels of Fig. shows the contour plots of 

743 the time mean RMSEs of the normal EAKF (left), and that 

744 of the EAKF-RN (right), in the 1/8 observation scenario. In 

745 this scenario, no filter divergence is spotted in the EAKF. 

746 Overall, the performance of the EAKF and the EAKF-RN 

747 is very close to each other, although the EAKF-RN has a 

748 slightly lower minimum time mean RMSE (2.9556 achieved 

749 at {l c = 0.1, A = 1)) than that of the EAKF (2.9619 obtained 

750 at the same place). 

751 We then examine the impact of residual nudging on the 

752 time mean spreads of the filters in different observation sce- 

753 narios. For the full and 1/8 observation scenarios, we plot 

754 the time mean spreads of the EAKF and the EAKF-RN in 

755 Fig. 21 while for the 1/2 and 1/4 observation scenarios, we 

756 report them in Tables [2] and [3] in the parentheses after the 

757 RMSE values. In all the reported cases in which the EAKF 

758 does not diverge, the time mean spreads of the EAKF-RN in 

759 general do not significantly deviate from those of the EAKF. 

760 In cases that the EAKF does diverge, the EAKF-RN may 

761 still maintain positive and finite time mean spreads. The 

762 closeness of the time mean spreads of the EAKF and EAKF- 

763 RN in the former cases, though, may depend on the exper- 

764 iment settings, e.g., the choice of the noise level coefficient 

765 p. However, from our experience, as long as /3 is reasonably 

766 large (say ft ^ 2), the time mean spread of the EAKF-RN 

767 often approaches that of the EAKF. For brevity, hereafter 

768 we do not report the spread values any more. 

769 Overall, in both the normal EAKF and the EAKF-RN, 

770 their time mean RMSEs tend to increase as the number of 

771 elements in an observation decreases. The performance of 

772 the EAKF-RN, in terms of time mean RMSE, is in general 

773 comparable to, and sometimes (substantially) better than, 

774 that of the EAKF. Moreover, the EAKF-RN tends to per- 

775 form more stably than the EAKF. 



776 Results with different noise level coefficients Next 

777 we examine the effect of the noise level coefficient P on the 

778 performance of the EAKF-RN. The experiment settings are 

779 as follows. We conduct the experiments in four observation 

780 scenarios as in the previous experiment. The ensemble size 

781 of the EAKF-RN is 20. We choose the noise level coefficient 

782 p from the sets {0}, {0.02 : 0.02 : 0.1}, {0.2 : 0.2 : 1}, 

783 and {2,3,4,6,8}. The reason to single out P = will be 

784 given soon. Under the above setting, it is infeasible for us to 

785 adopt too many combinations of l c and A as in the previous 

786 experiment, either for presentation or computation. There- 

787 fore, we only choose two such combinations in the current 

788 experiment (similar choices will also be made in subsequent 

789 experiments, in which we can only afford to vary some of 

790 the parameter values, and have to freeze the rest). In the 

791 first combination we let l c = 0.1 and A = 1.15, and in the 



792 second l c = 0.3 and A = 1.05. From the previous experiment 

793 results, the former choice represents a relatively good filter 

794 configuration for the normal EAKF, while the latter a less 

795 proper one. We thus use these two configurations to illus- 

796 trate the effect of residual nudging when the normal EAKF 

797 has reasonable/ (relatively) poor performance. 

798 Fig. [S] depicts the time mean RMSEs of the EAKF-RN 

799 as functions of P in different observation scenarios, in which 

800 the relatively good filter configuration l c = 0.1 and A = 1.15 
sol is adopted. Due to different orders of magnitudes of P, the 

802 x-axes are all plotted in the logarithmic scale. For this rea- 

803 son, it is inconvenient to show the results of P — at log 

804 (= — oo). Instead, we plot the results at P — 0.005, and 

805 "artificially" label that point 0. The time mean RMSEs of 

806 the normal EAKF are independent of P, and are plotted as 

807 horizontal lines in the relevant sub-figures (if no filter di- 

808 vergence in the normal EAKF). In all observation scenarios, 

809 the time mean RMSEs of the EAKF-RN are relatively large 

810 at small P values (say P = 0.02). As p increases, the time 
en mean RMSEs of the EAKF-RN tend to converge to those 
812 of the normal EAKF. During the processes of convergence, 
en the minimum time mean RMSE of the EAKF-RN in the full 
8i4 observation scenario is lower than that of the normal EAKF, 
sis while the minimum time mean RMSEs of the EAKF-RN in 
8i6 other observation scenarios are either indistinguishable from 
si? (in the 1/2 and 1/4 observation scenarios), or slightly higher 
sis than (in the 1/8 observation scenario), those of the normal 

819 EAKF. 

820 Fig. [6] shows the time mean RMSEs of the normal 

821 EAKF and the EAKF-RN, with experiment settings similar 

822 to those in Fig.0 except that the covariance localization and 

823 inflation configuration becomes l c — 0.3 and A = 1.05, re- 

824 spectively, which, as will be shown below, makes the normal 

825 EAKF perform worse in comparison to the previous case in 

826 Fig. [5] 

827 With l c = 0.3 and A = 1.05, the resulting EAKF-RN 

828 behaves similarly to that with the previous configuration 

829 l c = 0.1 and A = 1.15. For the current filter configuration, 

830 though, as p grows, the time mean RMSEs of the EAKF- 

831 RN exhibit clear troughs in all observation scenarios. On the 

832 other hand, compared to the previous results in Fig. [5] the 

833 performance of the normal EAKF deteriorates in all obser- 

834 vation scenarios. Indeed, with the current filter configura- 

835 tion, the normal EAKF may perform (substantially) worse 

836 than the EAKF-RN under the same experiment settings, 

837 especially if a proper P value is chosen for the EAKF-RN. 

838 In particular, the normal EAKF diverges in the 1/2 (upper 

839 right) and 1/4 (lower left) observation scenarios, while no 

840 filter divergence is spotted in the EAKF-RN with P ^ 3, 

841 although the EAKF-RN does diverge in the 1/2 and 1/4 

842 observation scenarios, given P ^ 4. This suggests that one 

843 may increase the stability of the EAKF-RN against filter 

844 divergence by decreasing the value of P, so that Ck is closer 

845 to and the observation inversion becomes more influential 

846 in Eq. (|9a[). as we have discussed in i j2.3l 

847 It is also worth mentioning the behaviour of the EAKF- 

848 RN with small P values. As one can see in Figs. [5] and [6] 

849 given different filter configurations, the EAKF-RN may be- 

850 have quite differently at relatively large /? values. However, 

851 as P tends to 0, the time mean RMSEs of the EAKF-RN 

852 with different configurations tend to converge, despite the 

853 different combinations of l c and A. This is because, as P — > 0, 
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854 Cfc — > in Eq. {jJJ, hence the new estimate x£, according to 

855 Eq. (]9a|) . approaches the observation inversion x£, which is 

856 independent of, for instance, the half- width l c , the inflation 

857 factor A and the ensemble siz43- Since the time mean RMSE 

858 continuously depends on /?, it is not surprising to find that in 
ssg Figs. [5] and [61 the time mean RMSEs of the EAKF-RN with 

860 small p, say at /3 = 0.02, are very close to the corresponding 

861 values at ,8 = 0. 

862 More insights of the filters' behaviour may be gained 

863 by examining the fraction coefficient Ck in the EAKF-RN. 

864 For the relatively good filter configuration (l c — 0.1 and 

865 A = 1.15), we have seen in Fig. [5] that the EAKF and the 

866 EAKF-RN have very close performance, and our experiment 

867 results show that Ck mostly concentrate on 1, similar to the 

868 situations on the lower panels of Fig. [2] (not reported). Of 

869 more interest is the case in which the normal EAKF is less 

870 properly configured (Z c = 0.3 and A = 1.05), and may suffer 

871 from filter divergence. On the upper panels of Fig. [7| we 

872 show sample time series of the RMSEs of the normal EAKF 

873 and EAKF-RN (P = 2) in the 1/2 observation scenario. 

874 On the upper left panel, the EAKF has an exceptionally 

875 large RMSE (in the order of 10 21 ) at time step k = 26, is 

876 thus considered diverged. In contrast, on the upper right 
s?? panel, the EAKF-RN (j) = 2) has all the RMSEs less than 

878 5 (with the corresponding time mean RMSE being 1.8931), 

879 and filter divergence is avoided. The lower left panel shows 

880 the time series of the fraction coefficient Cfc, which has the 

881 mean 0.9499 and the median 1. Among 250 Ck values, 78 are 

882 less than 1. For reference, a histogram of Ck is plotted on the 

883 lower right panel, which confirms that Ck largely concentrate 

884 on 1. 

885 In Fig. [S] we also examine what happens before the nor- 

886 mal EAKF diverges. On the upper panel, we show the time 

887 series of the RMSEs of the EAKF (in the solid line with as- 

888 terisks) and the EAKF-RN (P = 2, in the dotted line with 

889 plus signs). One can see that, at the beginning, say, when the 

890 time instant k 5j 15, the difference between the EAKF and 

891 the EAKF-RN is relatively less significant. For 16 < k < 25, 

892 the difference becomes more obvious. On the middle panel 

893 we report the difference between the EAKF and the EAKF- 

894 RN (J3 = 2), in terms of the RMSE of the EAKF minus 

895 that of the EAKF-RN, for 1 sC k < 16. The reason for not 

896 including the RMSE differences at larger time instants is 

897 that their amplitudes are relatively large and may make rel- 

898 atively small values indistinguishable from 0, which is not 

899 desired for our purpose. On the lower panel, we also show 

900 the fraction coefficients Ck of the EAKF-RN (/3 = 2) for 

901 1 ^ k ^ 25. Note the availability of Ck depends on the avail- 

902 ability of the incoming observations, therefore Ck appear for 

903 every 4 steps only. Based on these figures, one may tell what 

904 happens to make the EAKF and EAKF-RN behave differ- 

905 ently. At time step k = 4, there is an incoming observation. 

906 However, because C4 = 1, the EAKF and EAKF-RN share 

907 identical estimates from k = 1 to k = 7. At k — 8, there is 

908 one more incoming observation, and this time cs is less than 

909 1, meaning that residual nudging is effective, so that there 

910 is a (very) small difference spotted between the estimates 



5 When the observation operator is time- varying, the assimilation 
step S a in general has an influence on the observation inversion, 
as S a decides when the observations are assimilated. 



911 of the EAKF and EAKF-RN. At k = 12, residual nudging 

912 is conducted again (but no more for subsequent steps up to 

913 k = 24), which, together with the previous residual nudging, 

914 makes the estimates of the EAKF-RN deviate from those of 

915 the EAKF, and eventually avoid filter divergence at k = 26. 

916 Overall, we have shown that, when the normal EAKF is 

917 properly configured, the performance of the normal EAKF 

918 and the EAKF-RN is in general comparable. However, if the 

919 EAKF is not configured properly, then the EAKF-RN may 

920 perform (substantially) better than the normal EAKF. For 

921 many large scale data assimilation problems, it may be very 

922 expensive to conduct an exte nsive parameter searching in 

923 order to configure the EnKF jAndersonl l2007T l. Should the 

924 EnKF be ill-configured, we expect that introducing residual 

925 nudging to the EnKF may enhance its performance, in terms 

926 of filter accuracy and/or stability against divergence. 



927 4.2.3 Results with different ensemble sizes Here we ex- 

928 amine the effect of the ensemble size n on the performance 

929 of the normal EAKF and the EAKF-RN. The experiment 

930 settings are as follows. We also conduct the experiment in 

931 four observation scenarios. The ensemble size n is chosen 

932 from the set {2, 4, 6, 8, 10, 20, 40, 60, 80}. In the experiment 

933 we fix Z c = 0.1 and A = 1.15 for both the normal EAKF and 

934 the EAKF-RN. In the EAKF-RN, we adopt two noise level 

935 coefficients, with p being 1 and 2, respectively. 

936 Fig. [9] shows the time mean RMSEs of the normal 

937 EAKF (solid lines with squares), and those of the EAKF- 

938 RNs with ft = 1 and 2 (dotted lines with bold points, and 

939 dash-dotted lines with crosses, respectively), in different ob- 

940 servation scenarios. In the full observation scenario, no filter 

941 divergence is found for all the ensemble sizes n in either 

942 filter. When n < 10, the EAKF-RN with /3 = 1 tends to 

943 perform better than the EAKF-RN with j3 = 2, while the 

944 latter is better than the normal EAKF. This is particularly 

945 the case with a relatively small ensemble size, say at n = 2. 

946 On the other hand, when n ^ 20, the time mean RMSEs of 

947 the three filters are almost indistinguishable. 

948 In the 1/2 observation scenario, the normal EAKF di- 

949 verges when n ^ 10, so there are no square markers appear- 

950 ing at those n values. The EAKF-RN with /3 = 2 appears 

951 more robust than the normal EAKF, although there is still 

952 a filter divergence spotted at n — 4. In contrast, the EAKF- 

953 RN with /3 = 1 is the most robust filter, which does not di- 

954 verge for all the tested ensemble sizes. In terms of time mean 

955 RMSE, though, when the filters do not diverge, the EAKF- 

956 RN with p = 1 tends to perform worse than the EAKF-RN 

957 with p — 2, while the latter appears to be indistinguishable 

958 from the normal EAKF for n ^ 20. 

959 The situations in the 1/4 and 1/8 observation scenarios 

960 are similar to that in the 1/2 one. In the 1/4 observation 

961 scenario, the normal EAKF diverges for n ^ 8, while the 

962 EAKF-RN appears to be more robust, except that there is 

963 a filter divergence at n — 4 for the EAKF-RN with P = 2. 

964 When n = 2, the EAKF-RN with P = 2 performs better 

965 than the filter with P = 1, but at n = 6 or 8, the filter with 

966 P = 1 performs better instead. For n 10, the performance 

967 of all three filters are almost indistinguishable. In the 1/8 

968 observation scenario, the normal EAKF and the EAKF-RN 

969 with P = 2 diverge at n = 2 and 4, while the EAKF-RN with 

970 P = 1 diverges only at n — 2. For n = 6 or 8, the EAKF- 
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971 RN with P = 1 has the best performance in terms of time 

972 mean RMSE, the EAKF-RN with (3 = 2 the second, while 

973 the normal EAKF the last. For n ^ 10, the performance of 

974 the three filters are almost indistinguishable, except that at 

975 n = 10, the time mean RMSE of the EAKF-RN with /3 = 1 

976 is slightly higher than those of the other two filters. 

977 The above results suggest that n — 20 appears to be a 

978 reasonable ensemble size for the normal EAKF in the L96 

979 model, since in all these four observation scenarios, the per- 

980 formance of the normal EAKF with n = 20 is very close to 

981 that with larger n values. As the ensemble size n decreases, 

982 the normal EAKF becomes more unstable. The performance 

983 of the EAKF-RN with (3 = 1 and 2 is almost indistinguish- 

984 able from the normal EAKF for n ^ 20. However, given 

985 smaller ensemble sizes, the EAKF-RN tends to perform bet- 

986 tor than the normal EAKF, in terms of both filter accuracy 

987 and stability against filter divergence. In particular, one may 

988 enhance the stability of the EAKF-RN by reducing the noise 

989 level coefficient /3, since as j3 — > 0, the time mean RMSEs 

990 of the EAKF-RN in different observation scenarios become 

991 independent of the ensemble size n, and approach the corre- 

992 sponding values at j3 = 0. This property may be of interest 

993 in certain circumstances, for instance, those in which, due 

994 to practical limitations, one can only afford to run an EnKF 

995 with a very small ensemble size, so that filter stability be- 

996 comes an important factor in consideration. 



997 4-2-4 Results with different assimilation steps and obser- 

998 vation noise variances Here we examine the effects of the 

999 assimilation step S a and the observation noise variance on 

1000 the performance of the normal EAKF and the EAKF-RN. 

1001 We assume that the observation noise covariance matrix 

1002 is in the form of 7I, where I is the identity matrix with a suit- 

1003 able dimension in different observation scenarios, and 7 > 

1004 is a real scalar. As a result, the variances of are 7 for all 

1005 variables in an observation vector, while the cross-variances 

1006 are all zero. The experiment settings are the following. The 

1007 ensemble size is 20, l c = 0.1 and A = 1.15 for both the 
loos normal EAKF and the EAKF-RN. The noise level coeffi- 

1009 cients f3 is 2 in the EAKF-RN. We conduct the experiment 

1010 in four different observation scenarios, and choose S a from 

1011 the set {1, 4, 8, 12}, and 7 from the set {0.01, 0.1, 1, 10, 50}. 

1012 The relatively large values of 7, say 7 = 10, 50, are used to 

1013 represent the scenario in which the quality of the observa- 

1014 tions is relatively poor. Here we assume that we know the 

1015 observation noise variance precisely, while in a subsequent 

1016 experiment we will consider the case in which the observa- 

1017 tion noise variance is mis-specified. 

1018 Figs. [TO] and [TJJ show the time mean RMSEs of the 

1019 normal EAKF and the EAKF-RN, respectively, in different 

1020 observation scenarios. In the full observation scenario (upper 

1021 left panels), for a fixed variance 7, the time mean RMSEs of 

1022 both the normal EAKF and the EAKF-RN tend to increase 

1023 as the assimilation step S a increases. On the other hand, 

1024 for a fixed S a , the time mean RMSEs of both filters appear 

1025 to be monotonically increasing functions of the variance 7. 

1026 With 7 = 0.01, 0.1, 1, the time mean RMSEs of the EAKF- 

1027 RN tend to be lower than those of the normal EAKF, while 

1028 with 7 = 10, 50, they are almost indistinguishable, meaning 

1029 that for relatively poor observation, the normal EAKF and 

1030 the EAKF-RN have almost the same performance in terms 



1031 of estimation accuracy, which appears to be also true in 

1032 other observation scenarios, as will be shown below. In terms 

1033 of filter stability, for S a — 8 and 12, the normal EAKF 

1034 diverges at 7 = 0.01 and 0.1, but the EAKF-RN avoids 

1035 filter divergences at all these places. 

1036 In the 1/2 observation scenario (upper right panels), for 

1037 a fixed variance 7, the time mean RMSEs of both the nor- 

1038 mal EAKF and the EAKF-RN also grow as the assimilation 

1039 step S a increases. However, for a fixed S a , the time mean 

1040 RMSEs of the two filters have behaviour different from that 

1041 in the previous observation scenario. For S a = 1, the time 

1042 mean RMSE of the normal EAKF is still a monotonically 

1043 increasing function of 7; for S a = 4, 8, the normal EAKF 

1044 diverges at 7 = 0.01 and 0.1, and has monotonically in- 

1045 creasing time mean RMSE for 7^1; for S a = 12, the time 

1046 mean RMSE of the normal EAKF achieves its minimum at 

1047 7 = 0.1 (slightly lower than that at 0.01), and thus exhibits 

1048 the U-turn behaviour, a phenomenon that is more visible 

1049 in the EAKF-RN. Indeed, for all tested S a values, the time 

1050 mean RMSEs of the EAKF-RN all have their minima at 

1051 7 = 0.1, rather than at 7 — 0.01. The normal EAKF and the 

1052 EAKF-RN have almost indistinguishable time mean RM- 

1053 SEs for 7^1. While the normal EAKF tends to perform 

1054 better than the EAKF-RN at 7 = 0.01 and 0.1 in terms 

1055 of time mean RMSE, it is more likely to suffer from filter 

1056 divergence (e.g., at S a = 4,8). This is an example of the 

1057 trade-off between filter accuracy and stability, as discussed 

1058 in £12.31 

1059 In the 1/4 observation scenario (lower left panels), for 

1060 a fixed assimilation step S a , the time mean RMSEs of both 

1061 the normal EAKF and the EAKF-RN again appear to be 

1062 monotonically increasing as 7 increases. For a fixed variance 

1063 7, though, the time mean RMSEs of both filters tend to 

1064 exhibit the U-turn behaviour, in which the minimum time 

1065 mean RMSE is achieved at S a — 4 (except for the filter 

1066 divergence in the normal EAKF at 7 = 0.01), rather than 

1067 at S a = 1. The normal EAKF and the EAKF-RN have 

1068 almost indistinguishable time mean RMSEs for 7 ^ 0.1. 

1069 At 7 = 0.01, though, the normal EAKF seems to perform 

1070 better than the EAKF-RN in terms of time mean RMSE. 

1071 However, filter divergences are spotted at (S a = 4, 7 = 0.01) 

1072 and (S a ~ 1,7 = 50), which are again avoided in the EAKF- 

1073 RN. 

1074 In the 1/8 observation scenario (lower right panels), the 

1075 quantitative behaviour of the two filters, as functions of S a 

1076 and 7, is almost the same as that in the 1/4 observation sce- 

1077 nario. The main differences are the following. The time mean 

1078 RMSEs of the normal EAKF and the EAKF-RN are almost 

1079 indistinguishable in all tested cases. Filter divergences are 

1080 spotted at S a — 1, with 7 = 1, 10 and 50, respectively, not 
ioai only in the normal EAKF, but also in the EAKF-RN. One 
1082 may, however, avoid these filter divergences in the EAKF- 
ioa3 RN by assigning to it a smaller /3, as some of the previous 
1084 experiment results have suggested. 

loss Overall, the above experiment results are consistent 

1086 with our discussion in § 12.31 When equipped with residual 

1087 nudging, the EAKF-RN appears to be more stable than the 
loss normal EAKF, although maybe at the cost of some loss of es- 
ioa9 timation accuracy in certain circumstances (e.g., when with 
1090 too small p values). 
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1091 Results with imperfect models and mis-specified ob- 

1092 servation error covariances Finally, we examine filter per- 

1093 formance of the normal EAKF and the EAKF-RN when 

1094 they are subject to uncertainties in specifying the forcing 

1095 term F in Eq. I|15p and the observation error covariance 

1096 Rfc. We again conduct the experiments in four observation 

1097 scenarios. The ensemble sizes of both filters are 20. The half- 

1098 width l c of covariance localization is 0.1, and the covariance 

1099 inflation factor A is 1.15. The true value of F is 8, while 
noo the true observation error covariance Ra, is I20- In the ex- 
noi periments we let the value of F in the (possibly) imperfect 

1102 model be chosen from the set {4, 6, 8, 10, 12}, and the (pos- 

1103 sibly) mis-specified covariance in the form of 7I20, with 

1104 7 G {0.25, 0.5, 1,2,5, 10} 0- In the EAKF-RN the noise level 

1105 coefficient (8 = 2. 

1106 Figs. H2l and ll3l show the time mean RMSEs of the nor- 

1107 mal EAKF and the EAKF-RN, respectively, as functions of 

1108 the (possibly) mis-specified driving force F and the obser- 

1109 vation noise variance 7, in different observation scenarios. 

1110 In the full observation scenario (upper left panels), for a 

1111 fixed 7, the time mean RMSEs of both filters exhibit the 

1112 U-turn behaviour with respect to F, achieving their minima 

1113 at F = 8. This point also appears to be valid in other ob- 

1114 servation scenarios. On the other hand, for a fixed F, the 

1115 behaviour of the filters is very similar to that reported in 

1116 Figs. [S] and [5] since the role of the (possibly) mis-specified 

1117 variance 7 is similar to the observation noise level coeffi- 

1118 cient f) (note, though, that 7 also appears in the computa- 

1119 tion of the Kalman gain). When 7 is relatively small (say 

1120 7 < 2), the EAKF-RN tends to perform better than the 

1121 normal EAKF in terms of time mean RMSE. Moreover, the 

1122 normal EAKF diverges at (F = 12, 7 = 0.25), while the 

1123 EAKF-RN avoids the divergence. On the other hand, when 

1124 7 is relatively large (say 7^6), the EAKF-RN and the nor- 

1125 mal EAKF have almost indistinguishable performance, not 

1126 only for the current experiment results, but also for those in 

1127 the other observation scenarios. This is largely because mis- 

1128 takenly over-estimating the variance 7 has an effect similar 

1129 to increasing j3, so that the observation inversion in Eq. (|9a[) 

1130 becomes less influential for state estimation, and the EAKF- 

1131 RN has almost the same estimate as the normal EAKF. 

1132 In the 1/2 observation scenario (upper right panels), 

1133 when 7 is relatively small (say 7^1), the normal EAKF 

1134 tends to diverge for all F. The EAKF-RN avoids filter diver- 

1135 gences in some of the areas, though there are still two cases 

1136 spotted at F = 12, with 7=1 and 7 = 2, respectively. As 7 

1137 becomes larger, the performance of the normal EAKF and 
ins the EAKF-RN are very close to each other, similar to the 

1139 situation in the full observation scenario. In both the 1/4 

1140 and 1/8 observation scenarios (lower panels), there are also 

1141 almost no differences between the time mean RMSEs of the 

1142 two filters, although the time mean RMSE of the EAKF-RN 

1143 appears to be slightly lower than that of the normal EAKF 

1144 in the 1/4 observation scenario for relatively small F and 7 

1145 (around the lower left corners). Both filters diverge in the 

1146 1/4 observation scenario, at (F = 10, 7 = 0.25), otherwise 

1147 neither filter diverges. 



6 The (possibly) mis-specified observation error covariance, in the 
form of 7I2O; is used for both background update, as described in 
£12.11 and residual nudging through Eq. JSJl. 
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i«8 5 Discussion and conclusion 

1149 In this work we proposed an auxiliary technique, called 

1150 residual nudging, for ensemble Kalman filtering. The main 

1151 idea of residual nudging is to monitor, and if necessary, 

1152 adjust the residual norm of a state estimate. In an under- 

1153 determined state estimation problem, if the residual norm is 

1154 larger than a pre-specified value, then we reject the estimate 

1155 and replace it by a new one whose residual norm is equal to 

1156 the pre-specified value; otherwise we accept the estimate. We 

1157 discussed how to choose the pre-specified value, and demon- 

1158 strated how one can construct a new state estimate based 

1159 on the original one and the observation inversion, given a 

1160 linear observation operator. 

1161 Through the numerical experiments in both the scalar 

1162 AR1 and the Lorenz 96 models, we showed that, by choos- 

1163 ing a proper noise level coefficient, the ensemble adjustment 

1164 Kalman filter with residual nudging (EAKF-RN) in general 

1165 works more stably than the normal EAKF, while achiev- 

1166 ing an accuracy that is often comparable to, sometime even 

1167 (much) better than that of the normal EAKF, especially if 

1168 the normal EAKF is ill-configured. This may occur, for in- 

1169 stance, when the EAKF is equipped with improperly chosen 

1170 covariance inflation factor and/or half- width of covariance 

1171 localization, too small ensemble size, and so on. In many 

1172 data assimilation practices, it may be very expensive to con- 

1173 duct extensive searching for proper inflation factor and/or 

1174 half-width, or to run a large scale model with too many 

1175 ensemble members. In such circumstances, we expect that 

1176 residual nudging may help improve the filter performance, 

1177 in terms of filter stability, and even accuracy. 

1178 We also implemented residual nudging in some other 

1179 filters, includi n g the stochastic ensemble Kalman filter 

1180 (|Burgers et all Il998h and the singular evo l utive interpo- 

1181 lated Kalman filter (SEIK) (|Hoteit et all |2002| ; iPhaml . 

1182 l200ll ). and observed similar performance improvements (not 

1183 shown in this work) . Since residual nudging only aims to ad- 

1184 just the estimates, we envision that residual nudging can be 

1185 associated with other data assimilation approaches, includ- 

1186 ing, for instance, the extended Kalman filter, the particle 

1187 filter, and various smoothers. This will be verified elsewhere. 

1188 One problem not addressed in this work is the nonlin- 

1189 earity of the observation operator. In such circumstances, we 

1190 conjecture that the rule in choosing the pre-specified value 

1191 /3-\/trace(Rfc) may still be applicable. However, the con- 

1192 struction of new state estimates would become more com- 

1193 plicated than Eqs. © and @. One possible strategy is to 

1194 linearize the observation operator, or employ more sophisti- 

1195 cated method s, such as itera t ive searching algorithms (see, 

1196 for e xample, iGu and Oliverl 1200 7l ; lLorentzen and Naevdall 

1197 l201lh . to find new estimates whose residual norms are no 

1198 larger than J S^/trace(Rj ; ). This is another topic that will be 

1199 investigated in the future. 
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Table 1. Time mean RMSEs and spreads of the KF, and the minimum time mean RMSEs (over different /3) of the KF-RN, in the AR1 
model with different S a - The KF and KF-RN have identical time mean spreads, therefore only those of the KF are presented. In the 
bottom row we also report the ranges of f) in which the minimum time mean RMSEs of the KF-RN are achieved. 



KF 




Sa 






1 


2 


4 


8 


RMSE 


0.6184 


0.8260 


1.0592 


1.2997 


spread 


0.7729 


1.0413 


1.3419 


1.8241 


KF-RN 




Sa 






1 


2 


4 


8 


min RMSE 


0.6183 


0.8259 


1.0592 


1.2997 


achieved at 


= 2 


/3 = 2 


P > 2 


£ > 2 
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Table 2. Time mean RMSEs (spreads) of the normal EAKF and the EAKF-RN in the 1/2 observation scenario, as functions of the 
covariancc inflation factor and the half-width of covariancc localization. 



EAKF 


l c = 0.1 


l c = 0.2 


l c = 0.3 


l c = 0.4 


l c = 0.5 


A = 


1.00 


1.0721 (0.7049) 


Div 


Div 


Div 


Div 


A = 


1.05 


1.0091 (0.7457) 


Div 


Div 


Div 


Div 


A = 


1.10 


0.9789 (0.7868) 


Div 


Div 


Div 


Div 


A = 


1.15 


0.9662 (0.8209) 


Div 


Div 


Div 


Div 


A = 


1.20 


0.9515 (0.8566) 


Div 


Div 


Div 


Div 


A = 


1.25 


0.9623 (0.8929) 


Div 


Div 


Div 


Div 


EAKF-RN 


l c = 0.1 


l c = 0.2 


l c = 0.3 


l c = 0.4 


lc = 0.5 


A = 


1.00 


1.0325 (0.7002) 


1.8256 (0.5697) 


2.1099 (0.5127) 


2.2734 (0.4736) 


2.2964 (0.4579) 


A = 


1.05 


1.0051 (0.7419) 


1.4072 (0.6185) 


1.9879 (0.5644) 


2.1821 (0.5269) 


2.2468 (0.5050) 


A = 


1.10 


0.9598 (0.7842) 


1.2313 (0.6553) 


1.8517 (0.6030) 


2.0342 (0.5699) 


2.1742 (0.5470) 


A = 


1.15 


0.9673 (0.8201) 


1.2024 (0.6870) 


1.6507 (0.6388) 


1.9317 (0.6015) 


2.0953 (0.5845) 


A = 


1.20 


0.9474 (0.8565) 


1.1788 (0.7183) 


1.5776 (0.6680) 


1.9059 (0.6336) 


2.0806 (0.6098) 


A = 


1.25 


0.9650 (0.8935) 


1.1856 (0.7484) 


1.5315 (0.6945) 


1.7778 (0.6603) 


2.0071 (0.6383) 
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Table 3. As in Table [2] except that it is in the 1/4 observation scenario. 



EAKF 


l c = 0.1 


l c = 0.2 


l c = 0.3 


l c = 0.4 


l c = 0.5 


A = 


1.00 


2.0685 (1.57.30) 


Div 


Div 


Div 


Div 


A = 


1.05 


1.990s (1.7849) 


Div 


Div 


Div 


Div 


A = 


1.10 


Cs Anno I c\ r\ A A T\ 

2.0226 (2.U447J 


2.3014 (1.5640) 


Div 


Div 


Div 


A = 


1.15 


2.0819 (2.3592) 


2.2174 (1.7254) 


2.9502 (1.5820) 


Div 


Div 


A = 


1.20 


2.1903 (2.6869) 


2.1839 (1.9468) 


2.7534 (1.7191) 


Div 


Div 


A = 


1.25 


2.3586 (3.0392) 


2.2596 (2.2340) 


2.6413 (1.8780) 


Div 


Div 


EAKF-RN 


i c = 0.1 


l c = 0.2 


l c = 0.3 


l c = 0.4 


l c = 0.5 


A = 


1.00 


2.0840 (1.5689) 


2.6099 (1.1984) 


3.0267 (1.0110) 


3.0453 (0.8703) 


3.0469 (0.7899) 


A = 


1.05 


2.0042 (1.7790) 


2.3341 (1.3762) 


2.8493 (1.1936) 


3.0573 (1.0403) 


3.1015 (0.9618) 


A = 


1.10 


1.9860 (2.0339) 


2.2976 (1.5332) 


2.8154 (1.3484) 


3.0527 (1.2112) 


3.1251 (1.1028) 


A = 


1.15 


2.0766 (2.3648) 


2.2389 (1.7244) 


2.7737 (1.4940) 


3.1247 (1.3341) 


3.2583 (1.2558) 


A = 


1.20 


2.1886 (2.6948) 


2.2312 (1.9710) 


2.6566 (1.6824) 


3.0992 (1.5048) 


3.2340 (1.3674) 


A = 


1.25 


2.3436 (3.0359) 


2.2352 (2.2344) 


2.6168 (1.8427) 


3.0977 (1.6509) 


3.2897 (1.5098) 
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Figure 1. Time mean RMSEs of the KF and the KF-RN as functions of the noise level coefficient in the AR1 model, with different S a - 
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Figure 2. Left panels: Sample time series of the fraction coefficients of the KF-RN with /3 = 0.1 (upper) and /3 = 1 (lower), respectively. 
Right panels: The corresponding histograms of the fraction coefficient time series. 
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Figure 3. Time mean RMSEs of the normal EAKF and the EAKF-RN, as functions of inflation factor and half-width, in the full and 
1/8 observation scenarios. 
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Figure 4. Time mean spreads of the normal EAKF and the EAKF-RN, as functions of inflation factor and half-width, in the full and 
1/8 observation scenarios. 
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Figure 5. Time mean RMSEs of the normal EAKF and the EAKF-RN as functions of the noise level coefficient in different observation 
scenarios, with A = 1.15 and l c = 0.1. 
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Figure 6. As in Fig. [5] but with A = 1.05 and l c = 0.3 for both the filters. Note that in the 1/2 and 1/4 observation scenarios divergences 
of the normal EAKF are spotted, hence no horizontal lines are indicated in the corresponding plots. The EAKF-RN also diverges in the 
1/2 and 1/4 observation scenarios for /3 ^ 4. 



© 0000 Tellus, 000, 000-000 



24 BY X. LUO AND I. HOTEIT 




Time step Value of the fraction coefficient 

Figure 7. Upper left: sample time series of the RMSE of the normal EAKF in the 1/2 observation scenario; Upper right: sample time 
series of the RMSE of the EAKF-RN (/3 = 2) under the same experiment settings as the EAKF; Lower left: corresponding fraction 
coefficient in the EAKF-RN (/3 = 2); Lower right: corresponding histogram of Ck- 
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Figure 8. Upper: the RMSE of the EAKF (solid line with asterisks) and EAKF-RN (/3 = 2, dotted line with plus signs) between the 
time instant k = 1 and k = 25; Middle: difference in the RMSE ( = RMSE of the EAKF - RMSE of the EAKF-RN) between k = 1 and 
k = 16; Lower: the fraction coefficient of the EAKF-RN (/3 = 2) between k = 1 and k = 25. 
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Figure 9. Time mean RMSEs of the EAKF and the EAKF-RN, as functions of the ensemble size in different observation scenarios. 
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Figure 10. Time mean RMSEs of the normal EAKF, as functions of the assimilation step S a and the observation noise variance, in 
different observation scenarios. 
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Figure 11. As in Fig.[L0] but for the EAKF-RN with ,9 = 2. 
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Figure 12. Time mean RMSEs of the EAKF, as functions of the (possibly) mis-specified driving force F and the observation noise 
variance 7, in different observation scenarios. 
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Figure 13. As in Fig. [[2] but for the EAKF-RN with ,9 = 2. 
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